Abstract

Via a simulation study we compare the finite sample performance of
the deconvolution kernel density estimator in the supersmooth
deconvolution problem to its asymptotic behaviour predicted by two
asymptotic normality theorems. Our results indicate that for lower noise
levels and moderate sample sizes the match between the asymptotic
theory and the finite sample performance of the estimator is not
satisfactory. On the other hand we show that the two approaches produce
reasonably close results for higher noise levels. These observations in
turn provide additional motivation for the study of deconvolution
problems under the assumption that the error term variance
$\sigma^2 \to 0$ as the sample size $n \to \infty$.

Keywords: finite sample behavior, asymptotic normality, deconvolution
kernel density estimator, Fast Fourier Transform.

1 Introduction



Let $X_1, \ldots, X_n$ be i.i.d. observations, where $X_i = Y_i + Z_i$ and
the $Y$'s and $Z$'s are independent. Assume that the $Y$'s are unobservable
and that they have the density $f$ and also that the $Z$'s have a known
density $k$. The deconvolution problem consists in estimation of the
density $f$ based on the sample $X_1, \ldots, X_n$.

A popular estimator of $f$ is the deconvolution kernel density estimator,
which is constructed via Fourier inversion and kernel smoothing. Let $w$ be
a kernel function and $h > 0$ a bandwidth. The kernel deconvolution density
estimator $f_{nh}$ is defined as

$$f_{nh}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\,\frac{\phi_w(ht)\,\phi_{emp}(t)}{\phi_k(t)}\,dt = \frac{1}{nh}\sum_{j=1}^{n} w_h\Big(\frac{x - X_j}{h}\Big), \qquad (1)$$

where $\phi_{emp}$ denotes the empirical characteristic function of the
sample, i.e.

$$\phi_{emp}(t) = \frac{1}{n}\sum_{j=1}^{n} e^{itX_j},$$

$\phi_w$ and $\phi_k$ are Fourier transforms of the functions $w$ and $k$,
respectively, and

$$w_h(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\,\frac{\phi_w(t)}{\phi_k(t/h)}\,dt.$$



The estimator (1) was proposed in Carroll and Hall (1988) and Stefanski
and Carroll (1990) and there is a vast amount of literature dedicated to
it (for additional bibliographic information see e.g. van Es and Uh (2004)
and van Es and Uh (2005)).
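The definition (1) can be evaluated directly by numerical integration. The following is a minimal sketch, not the implementation used in this paper: it assumes Gaussian noise $\mathcal{N}(0, \sigma^2)$, so that $1/\phi_k(s/h) = \exp[\sigma^2 s^2/(2h^2)]$, a kernel whose Fourier transform is $(1 - s^2)^3$ on $[-1, 1]$, and a plain midpoint rule. The function names, the number of quadrature nodes `m`, the seed and the bandwidth are illustrative choices.

```python
import math
import random


def phi_w(s):
    """Fourier transform of the kernel: (1 - s^2)^3 on [-1, 1], zero elsewhere."""
    return (1.0 - s * s) ** 3 if abs(s) <= 1.0 else 0.0


def w_h(u, h, sigma, m=2000):
    """w_h(u) = (1/pi) * int_0^1 cos(s u) phi_w(s) / phi_k(s/h) ds, midpoint rule.

    Assumes N(0, sigma^2) noise, i.e. 1 / phi_k(s/h) = exp(sigma^2 s^2 / (2 h^2)).
    The integrand is even in s, so only [0, 1] is integrated.
    """
    total = 0.0
    for i in range(m):
        s = (i + 0.5) / m
        total += math.cos(s * u) * phi_w(s) * math.exp(0.5 * (sigma * s / h) ** 2)
    return total / (m * math.pi)


def f_nh(x, sample, h, sigma):
    """Deconvolution kernel density estimate at the point x, second form of (1)."""
    return sum(w_h((x - xj) / h, h, sigma) for xj in sample) / (len(sample) * h)


# Illustrative data: Y ~ N(0, 1) contaminated by Z ~ N(0, 0.4^2).
random.seed(0)
sample = [random.gauss(0.0, 1.0) + random.gauss(0.0, 0.4) for _ in range(200)]
estimate_at_zero = f_nh(0.0, sample, h=0.3, sigma=0.4)
```

With `sigma = 0` the function `w_h` reduces to the kernel $w$ itself, which gives a simple sanity check: $w(0) = (1/\pi)\int_0^1 (1 - s^2)^3\,ds = 16/(35\pi)$.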



Depending on the rate of decay of the characteristic function $\phi_k$ at
plus and minus infinity, deconvolution problems are usually divided into
two groups, ordinary smooth deconvolution problems and supersmooth
deconvolution problems. In the first case it is assumed that $\phi_k$
decays algebraically and in the second case the decay is essentially
exponential. This rate of decay, and consequently the smoothness of the
density $k$, has a decisive influence on the performance of (1). The
general picture that one sees is that the smoother $k$ is, the harder the
estimation of $f$ becomes, see e.g. Fan (1991a).

Asymptotic normality of (1) in the ordinary smooth case was established
in Fan (1991b), see also Fan and Liu (1997). The limit behaviour in this
case is essentially the same as that of a kernel estimator of a higher
order derivative of a density. This is obvious in certain relatively
simple cases where the estimator is actually equal to the sum of
derivatives of a kernel density estimator, cf. van Es and Kok (1998).

Our main interest, however, lies in asymptotic normality of (1) in the
supersmooth case. In this case under certain conditions on the kernel $w$
and the unknown density $f$, the following theorem was proved in Fan (1992).

Theorem 1.1. Let $f_{nh}$ be defined by (1). Then

$$\frac{\sqrt{n}}{s_n}\,\big(f_{nh}(x) - \mathrm{E}[f_{nh}(x)]\big) \stackrel{\mathcal{D}}{\to} \mathcal{N}(0, 1) \qquad (2)$$

as $n \to \infty$. Here either $s_n^2 = (1/n)\sum_{j=1}^{n} Z_{nj}^2$, or
$s_n^2$ is the sample variance of $Z_{n1}, \ldots, Z_{nn}$ with
$Z_{nj} = (1/h)\,w_h((x - X_j)/h)$.

The asymptotic variance of $f_{nh}$ itself does not follow from this
result. On the other hand van Es and Uh (2004), see also van Es and Uh
(2005), derived a central limit theorem for (1) where the normalisation is
deterministic and the asymptotic variance is given.

For the purposes of the present work it is sufficient to use the result of
van Es and Uh (2005). However, before recalling the corresponding theorem,
we first formulate conditions on the kernel $w$ and the density $k$.

Condition 1.1. Let $\phi_w$ be real-valued, symmetric and have support
$[-1, 1]$. Let $\phi_w(0) = 1$, and assume
$\phi_w(1 - t) = A t^{\alpha} + o(t^{\alpha})$ as $t \downarrow 0$ for some
constants $A$ and $\alpha \geq 0$.

The simplest example of such a kernel is the sinc kernel

$$w(x) = \frac{\sin x}{\pi x}. \qquad (3)$$

Its characteristic function equals $\phi_w(t) = 1_{[-1,1]}(t)$. In this
case $A = 1$ and $\alpha = 0$.

Another kernel satisfying Condition 1.1 is

$$w(x) = \frac{48\cos x}{\pi x^4}\Big(1 - \frac{15}{x^2}\Big) - \frac{144\sin x}{\pi x^5}\Big(2 - \frac{5}{x^2}\Big). \qquad (4)$$

Its corresponding Fourier transform is given by
$\phi_w(t) = (1 - t^2)^3\,1_{[-1,1]}(t)$. Here $A = 8$ and $\alpha = 3$. The
kernel was used for simulations in Fan (1992) and its good performance in
deconvolution context was established in Delaigle and Hall (2006).



Yet another example is

$$w(x) = \frac{3}{8\pi}\Big(\frac{\sin(x/4)}{x/4}\Big)^4. \qquad (5)$$

The corresponding Fourier transform equals

$$\phi_w(t) = 2(1 - |t|)^3\,1_{[1/2,1]}(|t|) + (6|t|^3 - 6t^2 + 1)\,1_{[-1/2,1/2]}(t).$$

Here $A = 2$ and $\alpha = 3$. This kernel was considered in Wand (1998) and
Delaigle and Hall (2006).
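As a worked check of Condition 1.1, the constants $A$ and $\alpha$ quoted for the three kernels can be read off from the behaviour of $\phi_w$ near the right end of its support. The expansions below use only the Fourier transforms stated above.

```latex
% Kernel (3): \phi_w(t) = 1_{[-1,1]}(t)
\phi_w(1-t) = 1 = 1 \cdot t^{0}, \qquad 0 \le t \le 2,
\quad\text{so } A = 1,\ \alpha = 0.

% Kernel (4): \phi_w(t) = (1-t^2)^3 \, 1_{[-1,1]}(t)
\phi_w(1-t) = \bigl(1-(1-t)^2\bigr)^3 = (2t-t^2)^3
            = 8t^3\bigl(1-\tfrac{t}{2}\bigr)^3 = 8t^3 + o(t^3),
\quad t \downarrow 0, \quad\text{so } A = 8,\ \alpha = 3.

% Kernel (5): for 0 < t \le 1/2 the argument 1-t lies in [1/2, 1]
\phi_w(1-t) = 2\bigl(1-|1-t|\bigr)^3 = 2t^3,
\quad\text{so } A = 2,\ \alpha = 3.
```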



Now we formulate the condition on the density k. 
Condition 1.2. Assume that
$\phi_k(t) \sim C|t|^{\lambda_0}\exp[-|t|^{\lambda}/\mu]$ as
$|t| \to \infty$, for some $\lambda > 1$, $\mu > 0$, $\lambda_0$ and some
constant $C$. Furthermore, let $\phi_k(t) \neq 0$ for all $t \in \mathbb{R}$.



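The following theorem holds true, see van Es and Uh (2005).

Theorem 1.2. Assume Conditions 1.1 and 1.2 and let
$\mathrm{E}[X^2] < \infty$. Then, as $n \to \infty$ and $h \to 0$,

$$\frac{\sqrt{n}}{h^{\lambda(1+\alpha)+\lambda_0-1}\,e^{1/(\mu h^{\lambda})}}\,\big(f_{nh}(x) - \mathrm{E}[f_{nh}(x)]\big) \stackrel{\mathcal{D}}{\to} \mathcal{N}\Big(0, \frac{A^2}{2\pi^2}\Big(\frac{\mu}{\lambda}\Big)^{2+2\alpha}(\Gamma(\alpha+1))^2\Big).$$

Here $\Gamma$ denotes the gamma function.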

The goal of the present note is to compare the theoretical behaviour of
the estimator (1) predicted by Theorem 1.2 to its behaviour in practice,
which will be done via a limited simulation study. The obtained results
can be used to compare Theorem 1.1 to Theorem 1.2, e.g. whether it is
preferable to use the sample standard deviation $s_n$ in the construction
of pointwise confidence intervals (computation of $s_n$ is more involved)
or to use the normalisation of Theorem 1.2 (this involves evaluation of a
simpler expression). The rest of the paper is organised as follows: in
Section 2 we present some simulation results, while in Section 3 we discuss
the obtained results and draw conclusions.



2 Simulation results 

All the simulations in this section were done in Mathematica. We considered 
three target densities. These densities are: 

1. density #1: $Y \sim \mathcal{N}(0, 1)$;

2. density #2: $Y \sim \chi^2(3)$;

3. density #3: $Y \sim 0.6\,\mathcal{N}(-2, 1) + 0.4\,\mathcal{N}(2, 0.8^2)$.

Density #2 was chosen because it is skewed, while density #3 was selected
because it has two unequal modes. We also assumed that the noise term $Z$
was $\mathcal{N}(0, 0.4^2)$ distributed. Notice that the noise-to-signal
ratio $\mathrm{NSR} = \mathrm{Var}[Z]/\mathrm{Var}[Y] \cdot 100\%$ for
density #1 equals 16%, for density #2 it is equal to 2.66%, and for
density #3 it is given by 3%. We have chosen the sample size $n = 50$ and
generated 500 samples from the density $g = f * k$. Notice that such $n$
was also used in simulations in e.g. Delaigle and Gijbels (2004). Even
though at first sight $n = 50$ might look too small for normal
deconvolution, for the low noise level that we have the deconvolution
kernel density estimator will still perform well, cf. Wand (1998). As a
kernel we took the kernel (4). For each model that we considered, the
theoretically optimal bandwidth, i.e. the bandwidth minimising

$$\mathrm{MISE}[f_{nh}] = \mathrm{E}\Big[\int_{-\infty}^{\infty} \big(f_{nh}(x) - f(x)\big)^2\,dx\Big], \qquad (6)$$

the mean integrated squared error of the estimator $f_{nh}$, was selected
by evaluating (6) for a grid of values $h_k = 0.01k$, $k = 1, \ldots, 100$,
and selecting the $h$ that minimised $\mathrm{MISE}[f_{nh}]$ on that grid.
Notice that it is easier to evaluate (6) by rewriting it in terms of the
characteristic functions, which can be done via Parseval's identity, cf.
Stefanski and Carroll (1990). For real data of course the above method does
not work, because (6) depends on the unknown $f$. We refer to Delaigle and
Gijbels (2004) for data-dependent bandwidth selection methods in kernel
deconvolution.
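The grid search for the theoretically optimal bandwidth can be sketched as follows for density #1. This is an illustration under stated assumptions rather than the Mathematica code used for the simulations. By Parseval's identity and $\mathrm{Var}[\phi_{emp}(t)] = (1 - |\phi_g(t)|^2)/n$, the criterion (6) equals $(1/2\pi)\int \{|\phi_w(ht)|^2 (1 - |\phi_g(t)|^2)/(n|\phi_k(t)|^2) + |1 - \phi_w(ht)|^2 |\phi_f(t)|^2\}\,dt$. The sketch assumes $f = \mathcal{N}(0,1)$, noise $\mathcal{N}(0, 0.4^2)$, kernel (4), a midpoint rule with `m` nodes, and a grid that starts at $h = 0.05$ instead of $0.01$ so that the exponentials stay within double precision.

```python
import math


def phi_w(s):
    """Fourier transform of kernel (4)."""
    return (1.0 - s * s) ** 3 if abs(s) < 1.0 else 0.0


def mise(h, n=50, sigma=0.4, m=2000):
    """MISE of f_nh for f = N(0, 1) and N(0, sigma^2) noise, via Parseval.

    The integrand is even in t, so (1/(2 pi)) * int over R = (1/pi) * int over t > 0.
    """
    t_max = max(1.0 / h, 10.0)  # covers the support of phi_w(h t) and the tail of phi_f
    dt = t_max / m
    total = 0.0
    for i in range(m):
        t = (i + 0.5) * dt
        pw = phi_w(h * t)
        phi_f_sq = math.exp(-t * t)  # |phi_f(t)|^2 for N(0, 1)
        # |phi_w|^2 (1 - |phi_g|^2) / (n |phi_k|^2), with |phi_k(t)|^2 = exp(-sigma^2 t^2)
        variance = pw * pw * (math.exp(sigma ** 2 * t * t) - phi_f_sq) / n
        bias_sq = (1.0 - pw) ** 2 * phi_f_sq
        total += (variance + bias_sq) * dt
    return total / math.pi


# Assumption: grid truncated at 0.05 from below to avoid overflow in exp().
grid = [0.01 * k for k in range(5, 101)]
h_opt = min(grid, key=mise)
```

The minimiser `h_opt` can then be compared with the bandwidth reported for density #1 in Table 1.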

Following the recommendation of Delaigle and Gijbels (2007), in order
to avoid possible numerical issues, the Fast Fourier Transform was used to
evaluate the estimate (1). Several outcomes for two sample sizes, $n = 50$
and $n = 100$, are given in Figure 1. We see that the fit in general is
quite reasonable. This is in line with results in Wand (1998), where it was
shown by finite sample calculations that the deconvolution kernel density
estimator performs well even in the supersmooth noise distribution case, if
the noise level is not too high.

In Figure 2 we provide histograms of estimates $f_{nh}(x)$ that we obtained
from our simulations for $x = 0$ and $x = 0.92$ (densities #1 and #2)
and for $x = 0$ and $x = 2.04$ (density #3). For density #1 the points
$x = 0$ and $x = 0.92$ were selected because the first corresponds to its
mode, while the second comes from the region where the value of the density
is moderately high. Notice that $x = 0$ is a boundary point for the support
of density #2 and that the derivative of density #2 is infinite there. For
density #3 the point $x = 0$ corresponds to the region between its two
modes, while $x = 2.04$ is close to where it has one of its modes. The
histograms look satisfactory and indicate that the asymptotic normality is
not an issue.



Figure 1: The estimate $f_{nh}$ (dotted line) and the true density $f$
(thin line) for the densities #1, #2 and #3. The left column gives results
for $n = 50$, while the right column provides results for $n = 100$.

Our main interest, however, is in comparison of the sample standard
deviation of (1) at a fixed point $x$ to the theoretical standard deviation
computed using Theorem 1.2. This is of practical importance e.g. for
construction of confidence intervals. The theoretical standard deviation
can be evaluated as

$$\mathrm{TSD} = \frac{A}{\sqrt{2}\,\pi}\Big(\frac{\mu}{\lambda}\Big)^{1+\alpha}\Gamma(\alpha+1)\,\frac{h^{\lambda(1+\alpha)+\lambda_0-1}\,e^{1/(\mu h^{\lambda})}}{\sqrt{n}},$$

upon noticing that in our case, i.e. when using kernel (4) and the error
distribution $\mathcal{N}(0, 0.4^2)$, we have $A = 8$, $\alpha = 3$,
$\lambda_0 = 0$, $\lambda = 2$, $\mu = 2/0.4^2$. After comparing this
theoretical value to the sample standard deviation of the estimator
$f_{nh}$ at points $x = 0$ and $x = 0.92$ (densities #1 and #2) and at
points $x = 0$ and $x = 2.04$ (density #3), see Table 1, we notice a
considerable discrepancy (by a factor 10 for density #1 and an even larger
discrepancy for densities #2 and #3). At the same time the sample means
evaluated at these two points are close to the true values of the target
density and broadly correspond to the expected theoretical value
$f * w_h(x)$. Note here that the bias of $f_{nh}(x)$ is equal to the bias of
an ordinary kernel density estimator based on a sample from $f$, see e.g.
Fan (1991a).
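The theoretical standard deviation is a closed-form expression and is cheap to evaluate. The sketch below assumes the setting of this section: kernel (4), so $A = 8$ and $\alpha = 3$, and $\mathcal{N}(0, \sigma_Z^2)$ noise, whose characteristic function $\exp[-\sigma_Z^2 t^2/2]$ gives $C = 1$, $\lambda_0 = 0$, $\lambda = 2$ and $\mu = 2/\sigma_Z^2$. The function and argument names are illustrative.

```python
import math


def theoretical_sd(h, n, A=8.0, alpha=3.0, lam=2.0, lam0=0.0, sigma_noise=0.4):
    """Standard deviation of f_nh(x) implied by the normalisation in Theorem 1.2."""
    mu = 2.0 / sigma_noise ** 2
    const = (A / (math.sqrt(2.0) * math.pi)
             * (mu / lam) ** (1.0 + alpha)
             * math.gamma(alpha + 1.0))
    rate = h ** (lam * (1.0 + alpha) + lam0 - 1.0) * math.exp(1.0 / (mu * h ** lam))
    return const * rate / math.sqrt(n)


# Bandwidths reported in Table 1 for densities #1, #2 and #3 (n = 50).
tsd_values = {label: theoretical_sd(h, 50)
              for label, h in (("#1", 0.24), ("#2", 0.18), ("#3", 0.25))}
```

For these bandwidths the function returns approximately 0.429, 0.169 and 0.512, which agrees with the theoretical standard deviations listed in Table 1.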



Figure 2: The histograms of estimates $f_{nh}(x)$ for $x = 0$ and
$x = 0.92$ for density #1 (top two graphs), for $x = 0$ and $x = 0.92$ for
density #2 (middle two graphs), and for $x = 0$ and $x = 2.04$ for
density #3 (bottom two graphs).

To gain insight into this striking discrepancy, recall how the asymptotic
normality of $f_{nh}(x)$ was derived in van Es and Uh (2005). Adapting the
proof from the latter paper to our example, the first step is to rewrite
$f_{nh}(x)$ as

$$f_{nh}(x) = \frac{1}{\pi h}\int_0^1 \phi_w(s)\exp[s^{\lambda}/(\mu h^{\lambda})]\,ds\;\frac{1}{n}\sum_{j=1}^{n}\cos\Big(\frac{x - X_j}{h}\Big) + \frac{1}{n}\sum_{j=1}^{n} R_{n,j}, \qquad (7)$$

where the remainder terms $R_{n,j}$ are defined in van Es and Uh (2005).
Then by estimating the variance of the second summand in (7), one can show
that it can be neglected when considering the asymptotic normality of (7)
as $n \to \infty$ and $h \to 0$. Turning to the first term in (7), one uses
the asymptotic equivalence, cf. Lemma 5 in van Es and Uh (2005),



$$\int_0^1 \phi_w(s)\exp[s^{\lambda}/(\mu h^{\lambda})]\,ds \sim A\,\Gamma(\alpha+1)\Big(\frac{\mu}{\lambda}h^{\lambda}\Big)^{1+\alpha} e^{1/(\mu h^{\lambda})}, \qquad (8)$$

which explains the shape of the normalising constant in Theorem 1.2.
However, this is precisely the point which causes a large discrepancy
between the theoretical standard deviation and the sample standard
deviation. The approximation is good asymptotically as $h \to 0$, but it is
quite inaccurate for larger values of $h$. Indeed, consider the ratio of
the left-hand side of (8) with the right-hand side. We have plotted this
ratio as a function of $h$ for $h$ ranging between 0 and 1, see Figure 3.
One sees that the ratio is close to 1 for extremely small values of $h$ and
is quite far from 1 for larger values of $h$. It is equally easy to see that
the poor approximation in (8) holds true for kernels (3) and (5) as well,
see e.g. Figure 3, which plots the ratio of both sides of (8) for the
kernel (3). This poor approximation, of course, is not characteristic of
only the particular $\mu$ and $\lambda$ that we used in our simulations,
but also holds true for other values of $\mu$ and $\lambda$.

Figure 3: Accuracy of (8) as a function of $h$ for the kernels (4) (left
figure) and (3) (right figure).

| $f$ | $h$ | $\hat{\mu}_1$ | $\hat{\mu}_2$ | $\hat{\sigma}_1$ | $\hat{\sigma}_2$ | $\sigma$ | $\tilde{\sigma}$ |
|---|---|---|---|---|---|---|---|
| #1 | 0.24 | 0.343 | 0.252 | 0.0423 | 0.039 | 0.429 | 0.072 |
| #2 | 0.18 | 0.066 | 0.389 | 0.035 | 0.067 | 0.169 | 0.114 |
| #3 | 0.25 | 0.074 | 0.159 | 0.025 | 0.037 | 0.512 | 0.068 |

Table 1: Sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ and sample standard
deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ evaluated at $x = 0$ and
$x = 0.92$ (densities #1 and #2) and $x = 0$ and $x = 2.04$ (density #3)
together with the theoretical standard deviation $\sigma$ and the corrected
theoretical standard deviation $\tilde{\sigma}$. The bandwidth is given
by $h$.

Obviously, one can correct for the poor approximation of the sample
standard deviation by the theoretical standard deviation by using the
left-hand side of (8) instead of its approximation. The theoretical
standard deviation corrected in such a way is given in the last column of
Table 1. As can be seen from the table, this procedure led to an
improvement of the agreement between the theoretical standard deviation and
its sample counterpart for all three target densities. Nevertheless, the
match is not entirely satisfactory, since the corrected theoretical
standard deviation and the sample standard deviation differ by a factor of
2 or even more. A perfect match is impossible to obtain, because we neglect
the remainder term in (7) and $h$ is still fairly large. We further notice
that the concurrence between the results is better for $x = 0$ than for
$x = 0.92$ for densities #1 and #2, and for $x = 2.04$ than for $x = 0$ for
density #3. We also performed simulations for the sample sizes $n = 100$
and $n = 200$ to check the effect of having larger samples. For brevity we
will report only the results for density #2, see Figure 4 and Table 2,
since this density is nontrivial to deconvolve, though not as difficult as
density #3. Notice that the results did not improve greatly for $n = 100$,
while for the case $n = 200$ the corrected theoretical standard deviation
became a worse estimate of the sample standard deviation than the
theoretical standard deviation. An explanation of this curious phenomenon
is given in Section 3.
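Both the accuracy of (8) and the corrected theoretical standard deviation can be computed numerically. The sketch below is an illustration under the assumptions of this section (kernel (4), $\mathcal{N}(0, 0.4^2)$ noise); it evaluates the left-hand side of (8) by a midpoint rule with `m` nodes and, following the first term of (7) together with the limit 1/2 for the variance of the cosine term, takes the corrected standard deviation to be the left-hand side of (8) divided by $\pi h \sqrt{2n}$. The function names are hypothetical.

```python
import math

A, ALPHA, LAM = 8.0, 3.0, 2.0  # kernel (4) and Gaussian noise
MU = 2.0 / 0.4 ** 2            # mu = 2 / sigma_Z^2


def lhs(h, m=4000):
    """Midpoint-rule value of int_0^1 phi_w(s) exp[s^lam / (mu h^lam)] ds."""
    c = 1.0 / (MU * h ** LAM)
    nodes = ((i + 0.5) / m for i in range(m))
    return sum((1.0 - s * s) ** 3 * math.exp(c * s ** LAM) for s in nodes) / m


def rhs(h):
    """Right-hand side of (8)."""
    return (A * math.gamma(ALPHA + 1.0)
            * (MU * h ** LAM / LAM) ** (1.0 + ALPHA)
            * math.exp(1.0 / (MU * h ** LAM)))


def corrected_sd(h, n):
    """Standard deviation of the first term in (7), with Var[cos(.)] replaced by 1/2."""
    return lhs(h) / (math.pi * h * math.sqrt(2.0 * n))


ratio_small_h = lhs(0.02) / rhs(0.02)   # close to 1
ratio_table_h = lhs(0.24) / rhs(0.24)   # far from 1
```

For $h = 0.24$ and $h = 0.18$ with $n = 50$, `corrected_sd` gives approximately 0.072 and 0.114, matching the last column of Table 1, while the ratio of the two sides of (8) at $h = 0.24$ is well below 1.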




Figure 4: The histograms of estimates $f_{nh}(x)$ for $x = 0$ and
$x = 0.92$ for density #2 for $n = 100$ (top two graphs) and for $n = 200$
(bottom two graphs).



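| $n$ | $h$ | $\hat{\mu}_1$ | $\hat{\mu}_2$ | $\hat{\sigma}_1$ | $\hat{\sigma}_2$ | $\sigma$ | $\tilde{\sigma}$ |
|---|---|---|---|---|---|---|---|
| 100 | 0.17 | 0.063 | 0.393 | 0.025 | 0.051 | 0.108 | 0.090 |
| 200 | 0.15 | 0.052 | 0.402 | 0.023 | 0.049 | 0.070 | 0.084 |

Table 2: Sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ and sample standard
deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ evaluated at $x = 0$ and
$x = 0.92$ for density #2, together with the theoretical standard deviation
$\sigma$ and the corrected theoretical standard deviation $\tilde{\sigma}$.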
Furthermore, note that

$$\mathrm{Var}\Big[\frac{1}{\sqrt{n}}\sum_{j=1}^{n}\cos\Big(\frac{x - X_j}{h}\Big)\Big] \to \frac{1}{2}$$

as $n \to \infty$ and $h \to 0$, see van Es and Uh (2005). This explains the
appearance of the factor 1/2 in the asymptotic variance in Theorem 1.2. One
might also question the goodness of this approximation and propose to use
instead some estimator of $\mathrm{Var}[\cos((x - X)/h)]$, e.g. its
empirical counterpart based on the sample $X_1, \ldots, X_n$. However, in
the simulations that we performed for all three target densities (with $n$
and $h$ as above), the resulting estimates took values close to the true
value 1/2. E.g. for density #3 the sample mean turned out to be 0.502298,
while the sample standard deviation was equal to 0.0535049, thus showing
that there was insignificant variability around 1/2 in this particular
example. On the other hand, for other distributions and for different
sample sizes, it could be the case that the direct use of 1/2 will lead to
inaccurate results.
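The origin of the limit 1/2 can be made explicit with a short calculation. Write $U = (x - X)/h$, where $X$ has density $g = f * k$ with characteristic function $\phi_g$; the argument below uses only the identity $\cos^2 u = (1 + \cos 2u)/2$ and the Riemann-Lebesgue lemma.

```latex
\mathrm{Var}[\cos U]
  = \mathrm{E}[\cos^2 U] - (\mathrm{E}[\cos U])^2
  = \tfrac{1}{2} + \tfrac{1}{2}\,\mathrm{E}[\cos 2U] - (\mathrm{E}[\cos U])^2 ,

\mathrm{E}[\cos(cU)]
  = \mathrm{Re}\,\bigl\{ e^{icx/h}\,\phi_g(-c/h) \bigr\},
\qquad
\bigl|\mathrm{E}[\cos(cU)]\bigr| \le |\phi_g(c/h)| \to 0
\quad (h \to 0,\ c = 1, 2),

\text{hence}\quad \mathrm{Var}[\cos U] \to \tfrac{1}{2}.
```

Since the $X_j$ are i.i.d., the variance of $n^{-1/2}\sum_j \cos((x - X_j)/h)$ equals $\mathrm{Var}[\cos U]$ for every $n$, so the speed of convergence is governed by how fast $\phi_g$ decays at $1/h$ and $2/h$.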

Next we report some simulation results relevant to Theorem 1.1. This
theorem tells us that for a fixed $n$ we have that

$$\frac{\sqrt{n}}{s_n}\,\big(f_{nh}(x) - \mathrm{E}[f_{nh}(x)]\big) \qquad (9)$$

is approximately normally distributed with zero mean and variance equal to
one. Upon using the fact that $\mathrm{E}[f_{nh}(x)] = f * w_h(x)$, we used
the data that we obtained from our previous simulation examples to plot the
histograms of (9) and to evaluate the sample means and standard deviations,
see Figure 5 and Table 3. One notices that the concurrence of the
theoretical and sample values is quite good for density #1. For density #2
it is rather unsatisfactory for $x = 0$, which is explainable by the fact
that in general there are very few observations originating from the
neighbourhood of this point. Finally, we notice that the match is
reasonably good for density #3, given the fact that it is difficult to
estimate, at the point $x = 2.04$, but is still unsatisfactory at the point
$x = 0$. The latter is explainable by the fact that there are fewer
observations originating from the neighbourhood of this point. An increase
in the sample size ($n = 100$ and $n = 200$) leads to an improvement of the
match between the theoretical and the sample mean and standard deviation at
the point $x = 0$ for density #2, see Figure 6 and Table 4; however, the
results are still largely inaccurate for this point. In essence similar
conclusions were obtained for density #3. These are not reported here.
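The random normalisation of Theorem 1.1 only requires the values $Z_{nj} = (1/h)\,w_h((x - X_j)/h)$, whose mean is $f_{nh}(x)$. The following sketch, written under the same illustrative assumptions as before (Gaussian noise, kernel (4), midpoint rule, hypothetical function names), computes the estimate, the quantity $s_n$ with $s_n^2 = (1/n)\sum_j Z_{nj}^2$, and the half-width $1.96\,s_n/\sqrt{n}$ of an approximate 95% interval. Note that such an interval is centred for $\mathrm{E}[f_{nh}(x)] = f * w_h(x)$, not for $f(x)$ itself.

```python
import math
import random


def w_h(u, h, sigma, m=1000):
    """(1/pi) * int_0^1 cos(s u) (1 - s^2)^3 exp(sigma^2 s^2 / (2 h^2)) ds, midpoint rule."""
    total = 0.0
    for i in range(m):
        s = (i + 0.5) / m
        total += math.cos(s * u) * (1.0 - s * s) ** 3 * math.exp(0.5 * (sigma * s / h) ** 2)
    return total / (m * math.pi)


def estimate_and_s_n(x, sample, h, sigma):
    """Return f_nh(x) and s_n, where s_n^2 = (1/n) * sum_j Z_nj^2."""
    z = [w_h((x - xj) / h, h, sigma) / h for xj in sample]
    n = len(z)
    estimate = sum(z) / n
    s_n = math.sqrt(sum(v * v for v in z) / n)
    return estimate, s_n


# Illustrative sample from density #1 with N(0, 0.4^2) noise, n = 50.
random.seed(1)
sample = [random.gauss(0.0, 1.0) + random.gauss(0.0, 0.4) for _ in range(50)]
estimate, s_n = estimate_and_s_n(0.0, sample, h=0.24, sigma=0.4)
half_width = 1.96 * s_n / math.sqrt(len(sample))
```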

Figure 5: The histograms of (9) for $x = 0$ and $x = 0.92$ for density #1
(top two graphs), for $x = 0$ and $x = 0.92$ for density #2 (middle two
graphs), and for $x = 0$ and $x = 2.04$ for density #3 (bottom two graphs).

Note that in all three models that we studied the noise level is not
high. We also studied the case when the noise level is very high. For
brevity we present the results only for density #1 and for sample size
$n = 50$. We considered three cases of the error distribution: in the first
case $Z \sim \mathcal{N}(0, 1)$, in the second case
$Z \sim \mathcal{N}(0, 2^2)$ and in the third case
$Z \sim \mathcal{N}(0, 4^2)$. Notice that the NSR is equal to 100%, 400% and
1600%, respectively. The simulation results are summarised in Figures 7 and
8 and Tables 5 and 6. We see that the sample standard deviation and the
corrected theoretical standard deviation are in better agreement with each
other compared to the low noise level case. Also the histograms of the
values of (9) look better. On the other hand the resulting curves $f_{nh}$
were not too satisfactory when compared to the true density $f$ in the two
cases $Z \sim \mathcal{N}(0, 1)$ and $Z \sim \mathcal{N}(0, 2^2)$
(especially in the second case) and were totally unacceptable in the case
$Z \sim \mathcal{N}(0, 4^2)$. This of course does not imply that the
estimator (1) is bad, rather the deconvolution problem is very difficult in
these cases.
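The noise-to-signal ratios quoted in this section follow from elementary variance calculations. The sketch below is only an arithmetic illustration; the helper names are hypothetical. It uses $\mathrm{Var}[\chi^2(3)] = 6$ and the usual formula for the variance of a normal mixture.

```python
def nsr_percent(noise_var, signal_var):
    """Noise-to-signal ratio Var[Z] / Var[Y] in percent."""
    return 100.0 * noise_var / signal_var


def mixture_variance(weights, means, variances):
    """Variance of a finite mixture: E[Y^2] - (E[Y])^2."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    return second_moment - mean ** 2


signal_var = {
    "#1": 1.0,                                                       # N(0, 1)
    "#2": 2.0 * 3.0,                                                 # chi^2(3)
    "#3": mixture_variance([0.6, 0.4], [-2.0, 2.0], [1.0, 0.8 ** 2]),
}

# Low noise level: Z ~ N(0, 0.4^2).
low_noise = {label: nsr_percent(0.4 ** 2, v) for label, v in signal_var.items()}

# High noise levels for density #1: Z ~ N(0, 1), N(0, 2^2), N(0, 4^2).
high_noise = [nsr_percent(s ** 2, signal_var["#1"]) for s in (1.0, 2.0, 4.0)]
```

This gives 16% for density #1, about 2.67% for density #2 and about 3.4% for density #3 at the low noise level, close to the rounded figures quoted in the text, and 100%, 400% and 1600% for the three high noise levels.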
| $f$ | $h$ | $\hat{\mu}_1$ | $\hat{\mu}_2$ | $\hat{\sigma}_1$ | $\hat{\sigma}_2$ |
|---|---|---|---|---|---|
| #1 | 0.24 | -0.046 | -0.093 | 0.953 | 1.127 |
| #2 | 0.18 | -3.984 | -0.084 | 17.2 | 1.28 |
| #3 | 0.25 | -0.768 | -0.141 | 4.03 | 1.63 |

Table 3: Sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ and sample standard
deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ of (9) evaluated at
$x = 0$ and $x = 0.92$ (densities #1 and #2) and $x = 0$ and $x = 2.04$
(density #3).



Figure 6: The histograms of (9) for $x = 0$ and $x = 0.92$ for density #2
for $n = 100$ (top two graphs) and $n = 200$ (bottom two graphs).



Finally, we mention that results qualitatively similar to the ones
presented in this section were obtained for the kernel (3) as well. These
are not reported here because of space restrictions.

3 Discussion 

In the simulation examples considered in Section 2 for Theorem 1.2 we
notice that the corrected theoretical asymptotic standard deviation is
always considerably larger than the sample standard deviation given the
fact that the noise level is not high. We conjecture that this might be
true for densities other than #1, #2 and #3 as well when the noise level is
low. This possibly is one more explanation of the reasonably good
performance of deconvolution kernel density estimators in the supersmooth
error case for relatively small sample sizes, which was noted in Wand
(1998). On the other hand the match between the sample standard deviation
and the corrected theoretical standard deviation is much better for higher
levels of noise. These observations suggest studying the asymptotic
distribution of the deconvolution kernel density estimator under the
assumption $\sigma \to 0$ as $n \to \infty$, cf. Delaigle (2007), where
$\sigma$ denotes the standard deviation of the noise term.

| $n$ | $h$ | $\hat{\mu}_1$ | $\hat{\mu}_2$ | $\hat{\sigma}_1$ | $\hat{\sigma}_2$ |
|---|---|---|---|---|---|
| 100 | 0.17 | -1.33 | -0.015 | 9.89 | 1.31 |
| 200 | 0.15 | -1.02 | -0.015 | 6.36 | 1.58 |

Table 4: Sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ and sample standard
deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ of (9) evaluated at
$x = 0$ and $x = 0.92$ for density #2 for two sample sizes: $n = 100$ and
$n = 200$.

| NSR | $h$ | $\hat{\mu}_1$ | $\hat{\mu}_2$ | $\hat{\sigma}_1$ | $\hat{\sigma}_2$ | $\sigma$ | $\tilde{\sigma}$ |
|---|---|---|---|---|---|---|---|
| 100% | 0.36 | 0.294 | 0.236 | 0.046 | 0.045 | 0.057 | 0.075 |
| 400% | 0.59 | 0.214 | 0.189 | 0.053 | 0.053 | 0.046 | 0.076 |
| 1600% | 0.89 | 0.150 | 0.156 | 0.279 | 0.289 | 0.251 | 0.342 |

Table 5: Sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ and sample standard
deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ together with the
theoretical standard deviation $\sigma$ and the corrected theoretical
standard deviation $\tilde{\sigma}$ evaluated at $x = 0$ and $x = 0.92$ for
density #1 for three noise levels: NSR = 100%, NSR = 400% and NSR = 1600%.

| NSR | $h$ | $\hat{\mu}_1$ | $\hat{\mu}_2$ | $\hat{\sigma}_1$ | $\hat{\sigma}_2$ |
|---|---|---|---|---|---|
| 100% | 0.36 | -0.038 | -0.098 | 1.091 | 1.228 |
| 400% | 0.59 | -0.079 | -0.134 | 1.155 | 1.193 |
| 1600% | 0.89 | -0.015 | 0.035 | 1.027 | 1.086 |

Table 6: Sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ and sample standard
deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ of (9) evaluated at
$x = 0$ and $x = 0.92$ for density #1 for three noise levels: NSR = 100%,
NSR = 400% and NSR = 1600%.

Figure 7: The histograms of $f_{nh}(x)$ for $x = 0$ and $x = 0.92$ for
density #1 for $n = 50$ and three noise levels: NSR = 100% (top two
graphs), NSR = 400% (middle two graphs) and NSR = 1600% (bottom two graphs).

Our simulation examples suggest that the asymptotic standard deviation
evaluated via Theorem 1.2 in general will not lead to an accurate
approximation of the sample standard deviation, unless the bandwidth is
small enough, which implies that the corresponding sample size must be
rather large. The latter is hardly ever the case in practice. On the other
hand, we have seen that in certain cases this poor approximation can be
improved by using the left-hand side of (8) instead of the right-hand side.
A perfect match is impossible to obtain given that we still neglect the
remainder term in (7). However, even after the correction step, the
corrected theoretical standard deviation still differs from the sample
standard deviation considerably for small sample sizes and lower levels of
noise. Moreover, in some cases the corrected theoretical standard deviation
is even farther from the sample standard deviation than the original
uncorrected version. The latter fact can be explained as follows:

1. It seems that both the theoretical and corrected theoretical standard 
deviation overestimate the sample standard deviation. 

2. The value of the bandwidth $h$, for which the match between the
corrected theoretical standard deviation and the sample standard deviation
becomes worse, belongs to the range where the corrected theoretical
standard deviation is larger than the theoretical standard deviation. In
view of item 1 above, it is not surprising that in this case the
theoretical value turns out to be closer to the sample standard deviation
than the corrected theoretical value.

Figure 8: The histograms of (9) for $x = 0$ and $x = 0.92$ for density #1
for $n = 50$ and three noise levels: NSR = 100% (top two graphs),
NSR = 400% (middle two graphs) and NSR = 1600% (bottom two graphs).

The consequence of the above observations is that a naive attempt to
directly use Theorem 1.2, e.g. in the construction of pointwise confidence
intervals, will lead to largely inaccurate results. An indication of how
large the contribution of the remainder term in (7) can be can only be
obtained after a thorough simulation study for various distributions and
sample sizes, a goal which is not pursued in the present note. From the
three simulation examples that we considered, it appears that the
contribution of the remainder term in (7) is quite noticeable for small
sample sizes. For now we would advise using Theorem 1.2 for small sample
sizes and lower noise levels with caution. It seems that a similarly
cautious approach is needed in the case of Theorem 1.1 as well, at least
for some values of $x$.

Unlike for the ordinary smooth case, see Bissantz et al. (2007), there
is no study dealing with the construction of uniform confidence intervals
in the supersmooth case. In the latter paper a better performance of the
bootstrap confidence intervals was demonstrated in the ordinary smooth case
compared to the asymptotic confidence bands obtained from the expression
for the asymptotic variance in the central limit theorem. The main
difficulty in the supersmooth case is that the asymptotic distribution of
the supremum distance between the estimator $f_{nh}$ and the true density
$f$ is unknown. Our simulation results seem to indicate that the bootstrap
approach is more promising for the construction of pointwise confidence
intervals than e.g. the direct use of Theorems 1.1 or 1.2. Moreover, the
simulations suggest that at least Theorem 1.2 is not appropriate when the
noise level is low.
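To make the bootstrap alternative concrete, a naive percentile bootstrap for a pointwise interval can be sketched as follows. This is an illustration only: it is not the procedure of Bissantz et al. (2007), its validity in the supersmooth case is not established here, and the function names, the number of resamples and the seed are assumptions. Because $f_{nh}(x)$ is the mean of $Z_j = (1/h)\,w_h((x - X_j)/h)$ for fixed $x$ and $h$, resampling the observations with replacement is equivalent to resampling the values $Z_j$.

```python
import math
import random


def w_h(u, h, sigma, m=1000):
    """(1/pi) * int_0^1 cos(s u) (1 - s^2)^3 exp(sigma^2 s^2 / (2 h^2)) ds, midpoint rule."""
    total = 0.0
    for i in range(m):
        s = (i + 0.5) / m
        total += math.cos(s * u) * (1.0 - s * s) ** 3 * math.exp(0.5 * (sigma * s / h) ** 2)
    return total / (m * math.pi)


def bootstrap_interval(x, sample, h, sigma, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of Z_j (bandwidth h kept fixed)."""
    rng = random.Random(seed)
    z = [w_h((x - xj) / h, h, sigma) / h for xj in sample]
    n = len(z)
    means = sorted(sum(rng.choices(z, k=n)) / n for _ in range(n_boot))
    lower_index = int((1.0 - level) / 2.0 * n_boot)
    upper_index = n_boot - 1 - lower_index
    return means[lower_index], means[upper_index]


# Illustrative sample from density #1 with N(0, 0.4^2) noise, n = 50.
random.seed(2)
sample = [random.gauss(0.0, 1.0) + random.gauss(0.0, 0.4) for _ in range(50)]
lower, upper = bootstrap_interval(0.0, sample, h=0.24, sigma=0.4)
```

As with Theorem 1.1, such an interval targets the smoothed quantity $f * w_h(x)$ unless the bias is handled separately.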